To really assess if you like an ordination, we will compare the
model-based ordinations to a few classical methods. For this we need the
procrustes function in the vegan package, because
that tells us how different two ordinations. We also need other
ordination methods of course, which we can get from the same
package.
I collected some datasets that we can work with, but if you have your own data you can start by analyzing that instead. mvabund has a few datasets included that we can play with:
and there are more datasets (e.g., “dune”, “pyrifos”, “mite”, and “BCI”) in the vegan R-package. These are even more datasets in the “data” folder of the workshop that you can use for this practical:
My suggestion is to try a few different data types in this exercise (e.g., presence-absence (alpine, eucalypt), ordinal (dune, Skabbholmen), abundance (take your pick), biomass (wadden) to get an impression of what it takes to analyse such datatypes. For some of these response types (e.g., ordinal, biomass) it might be more difficult to find software for multispecies modeling that also supports a suitable response distribution (don’t worry, gllvm has it all).
Start by choosing a dataset, I will again start with the waddensea (abundance) data.
Y <- read.table("../../data/waddenY.csv", sep="," ,header=TRUE, row.names = 2)[,-1]
I start by fitting the same model as I did in the last exercise, and directly plot it:
library(gllvm)
## Loading required package: TMB
##
## Attaching package: 'gllvm'
## The following object is masked from 'package:stats':
##
## simulate
model1 <- gllvm(Y, num.lv = 2, family = "poisson")
gllvm::ordiplot(model1, biplot = TRUE)
and I fit a DCA and a NMDS to the same data:
library(vegan)
DCA <- decorana(Y)
NMDS <- metaMDS(Y, trymax = 1000,try=100)
plot(DCA)
plot(NMDS)
Now we can use the procrustes error, which accounts for differences in scale and rotation, to assess if these ordinations are similar. At one, the rotations are the same, and at zero they are completely different.
procrustes(scores(DCA, choices=1:2), scores(NMDS), symmetric=TRUE)
##
## Call:
## procrustes(X = scores(DCA, choices = 1:2), Y = scores(NMDS), symmetric = TRUE)
##
## Procrustes sum of squares:
## 0.4575
procrustes(scores(DCA, choices=1:2), getLV(model1), symmetric=TRUE)
##
## Call:
## procrustes(X = scores(DCA, choices = 1:2), Y = getLV(model1), symmetric = TRUE)
##
## Procrustes sum of squares:
## 0.7403
procrustes(scores(NMDS), getLV(model1), symmetric=TRUE)
##
## Call:
## procrustes(X = scores(NMDS), Y = getLV(model1), symmetric = TRUE)
##
## Procrustes sum of squares:
## 0.7531
Note that NMDS does not possess species loadings, so we just compare everything based on the sites. The model-based ordination is more different from the DCA and NMDS than they are from each other. We can of course just throw in the towel, declare DCA and NMDS poor methods, and continue with our model-based ordination. However, I never checked if the Poisson distribution is actually a decent choice for this data, so let’s have a look at the residuals of the model first:
plot(model1)
there are a lot of patterns in the residuals, indicating that the model is not a great fit. That might explain why the ordinations are so different as well. We can adjust the model; we choose a negative-binomial distribution instead and check the residuals again to see if it is an improvement.
model2 <- gllvm(Y, num.lv = 2, family = "negative.binomial")
plot(model2)
This mode looks considerably better, since there are no patterns in the residuals. We can compare it to the first model with Poisson distribution:
## df AIC
## model1 173 36664.87
## model2 231 16777.35
procrustes(getLV(model1), getLV(model2), symmetric = TRUE)
##
## Call:
## procrustes(X = getLV(model1), Y = getLV(model2), symmetric = TRUE)
##
## Procrustes sum of squares:
## 0.3234
##
## Call:
## procrustes(X = scores(DCA, chouces = 1:2), Y = getLV(model2), symmetric = TRUE)
##
## Procrustes sum of squares:
## 0.5504
procrustes(scores(NMDS), getLV(model2), symmetric = TRUE)
##
## Call:
## procrustes(X = scores(NMDS), Y = getLV(model2), symmetric = TRUE)
##
## Procrustes sum of squares:
## 0.5365
Like this, the model-based ordination is already more similar to the DCA and NMDS ordinations, almost as similar as they are to each other. Ultimately, we do not need the ordinations to be similar at all, and we should not go through a model selection procedure trying to find an ordination that is similar to a classical ordination method. Classical ordination methods have their own quircks, and model-based ordination is trying to do something very different. So, if we do find an ordination that is very different from one of the classical methods, that can actually be a good thing! What we do want, is a model that fits our data well, so we can look at the residuals, think about the properties of our data, and compare models to find the best model-based ordination that we can find.
Suggestions for continuing this practical:
procrustes